How to Combine Fast Heuristic Markov Chain Monte Carlo with Slow Exact Sampling

Authors

  • ANTAR BANDYOPADHYAY
  • DAVID J. ALDOUS
Abstract

Given a probability law π on a set S and a function g : S → R, suppose one wants to estimate the mean ḡ = ∫ g dπ. The Markov chain Monte Carlo method consists of inventing and simulating a Markov chain with stationary distribution π. Typically one has no a priori bounds on the chain's mixing time, so even if simulations suggest rapid mixing one cannot infer rigorous confidence intervals for ḡ. But suppose there is also a separate method which (slowly) gives samples exactly from π. Using n exact samples, one could immediately get a confidence interval of length O(n^{-1/2}). But one can do better. Use each exact sample as the initial state of a Markov chain, and run each of these n chains for m steps. We show how to construct confidence intervals which are always valid, and which, if the (unknown) relaxation time of the chain is sufficiently small relative to m/n, have length O(n^{-1} log n) with high probability.

1 Background

Let π be a given probability distribution on a set S. Given a function g : S → R, we want to estimate its mean ḡ := ∫_S g(s) π(ds). As we learn in elementary statistics, one can obtain an estimate for ḡ by taking samples from π and using the sample average g-value as an estimator. But algorithms which sample exactly from π may be prohibitively slow. This is the setting for the Markov chain Monte Carlo (MCMC) method, classical in statistical physics and over the last ten years studied extensively as statistical methodology [4, 7, 9, 12]. In MCMC one designs a Markov chain on state-space S to have stationary distribution π. Then the sample average g-value over a long run of the chain is a heuristic estimator of ḡ. Diagnostics for assessing ...

¹ This material is based upon work supported by the National Science Foundation under Grant No. 9970901.
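The sampling scheme in the abstract (n independent chains of m MCMC steps, each started at an exact draw from π) can be sketched as below. This is an illustrative sketch, not the paper's always-valid confidence-interval construction: the function name `mcmc_from_exact_samples` and the arguments `exact_sampler` and `chain_step` are hypothetical, and the interval shown is the naive normal-approximation CI across the n i.i.d. chain means.

```python
import math
import random

def mcmc_from_exact_samples(exact_sampler, chain_step, g, n, m, rng):
    """Run n independent chains of m steps, each started at an exact sample from pi.

    Because every chain starts in stationarity, each chain's average g-value
    is an unbiased estimate of g-bar, and the n chain means are i.i.d.
    """
    chain_means = []
    for _ in range(n):
        x = exact_sampler(rng)           # slow but exact draw from pi
        total = 0.0
        for _ in range(m):
            x = chain_step(x, rng)       # one MCMC transition preserving pi
            total += g(x)
        chain_means.append(total / m)
    est = sum(chain_means) / n
    # Naive 95% CI from the sample variance of the n i.i.d. chain means
    # (NOT the paper's construction, which remains valid for any mixing time).
    s2 = sum((v - est) ** 2 for v in chain_means) / (n - 1)
    half = 1.96 * math.sqrt(s2 / n)
    return est, (est - half, est + half)
```

As a toy example, take π uniform on {0, ..., 9} (so exact sampling is trivial), let the chain be the symmetric random walk on the 10-cycle (which preserves π), and estimate ḡ = 4.5 for g(x) = x.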


Similar articles

Generalizing Elliptical Slice Sampling for Parallel MCMC

Probabilistic models are conceptually powerful tools for finding structure in data, but their practical effectiveness is often limited by our ability to perform inference in them. Exact inference is frequently intractable, so approximate inference is often performed using Markov chain Monte Carlo (MCMC). To achieve the best possible results from MCMC, we want to efficiently simulate many steps ...


Resampling Markov Chain Monte Carlo Algorithms: Basic Analysis and Empirical Comparisons

Sampling from complex distributions is an important but challenging topic in scientific and statistical computation. We synthesize three ideas, tempering, resampling, and Markov moving, and propose a general framework of resampling Markov chain Monte Carlo (MCMC). This framework not only accommodates various existing algorithms, including resample-move, importance resampling MCMC, and equi-ener...


Markov chain Monte Carlo methods for Dirichlet process hierarchical model

Inference for Dirichlet process hierarchical models is typically performed using Markov chain Monte Carlo methods, which can be roughly categorised into marginal and conditional methods. The former integrate out analytically the infinite-dimensional component of the hierarchical model and sample from the marginal distribution of the remaining variables using the Gibbs sampler. Conditional metho...


Efficient Monte Carlo Methods for Conditional Logistic Regression

Exact inference for the logistic regression model is based on generating the permutation distribution of the sufficient statistics for the regression parameters of interest conditional on the sufficient statistics for the remaining (nuisance) parameters. Despite the availability of fast numerical algorithms for the exact computations, there are numerous instances where a data set is too large t...



Publication date: 2001